Adding metric of time since last CodeOSS version update in update automation workflow #56
…ode-editorv2 into alarm-no-update
| name: Publish Success Metrics | ||
| runs-on: ubuntu-latest | ||
| needs: [update-automation, build-and-update-package-locks, generate-oss-attribution, create-pr, send-notification] | ||
| needs: [update-automation, build-and-update-package-locks, generate-oss-attribution, create-pr, send-notification, publish-release-lag-metric] |
Will publish-release-lag-metric execute in parallel with the other jobs?
Yes. It is listed in needs to make sure it has run before the success and failure metrics are decided.
| aws-region: us-east-1 | ||
| - name: Calculate and publish release lag metric | ||
| if: steps.aws-creds.outcome == 'success' |
What if this condition is not met? How can we detect it? Will it be covered by the automation update missing alarm?
Yes, if the metric is missing, the alarm will also be triggered:
treatMissingData: TreatMissingData.BREACHING
| CURRENT_TIMESTAMP=$(date +%s) | ||
| SECONDS_BEHIND=$((CURRENT_TIMESTAMP - SUBMODULE_COMMIT_TIMESTAMP)) | ||
| NORMALIZED_VALUE=$(awk "BEGIN {printf \"%.6f\", $SECONDS_BEHIND / 2592000}") |
I suggest making it more granular by normalizing to days instead of months, in case we update the alarm threshold later. We may also need to show this metric in the dashboard, and days are easier to display. It also means we need to update the alarm threshold to 30 in the gitfarm CR.
No problem, done.
Did you publish? I don't see an update.
Now changes are pushed!
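For reference, the day-based normalization requested in this thread could be sketched roughly as below. This is not the pushed change itself: the function name `lag_in_days` and the two-decimal precision are assumptions made for illustration. Only the idea of dividing the elapsed seconds by one day (86400 seconds) comes from the review comment.

```shell
# Hypothetical helper (name and precision are assumptions, not from the PR).
# Converts a lag in seconds into days so that an alarm threshold of 30
# corresponds to roughly one month.
lag_in_days() {
  # Pass the value with -v instead of interpolating it into the awk program.
  awk -v s="$1" 'BEGIN { printf "%.2f", s / 86400 }'
}

# Example: 2592000 seconds (30 days) prints 30.00
lag_in_days 2592000
```

With this scale, the dashboard shows whole days directly and the alarm threshold of 30 mentioned above lines up with the previous one-month limit.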
Issue #, if available:
Because the Automation Update workflow makes assumptions about CodeOSS version updates, the CodeEditor version can stay behind CodeOSS for longer than 1 month while the GitHub Action still publishes a success metric to CloudWatch during execution, which leaves the update problem undetected.
Description of changes:
A new metric, CodeOSSReleaseLag, is published. It reports the seconds elapsed since the last update, measured from the timestamp of the last commit in third-party-src in the repository. The value is normalized to months (so 1.0 means 1 month).
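As a rough illustration of how the elapsed time could be derived, here is a minimal sketch. It assumes the last commit's committer date is the timestamp of interest and that the step runs inside the repository checkout. The helper names `submodule_commit_timestamp` and `seconds_behind` are hypothetical; only `third-party-src` and the variable names in the usage comment come from the PR.

```shell
# Hypothetical helpers (names are assumptions, not from the PR).

# Prints the Unix timestamp of the most recent commit in the given directory.
# %ct is the committer date; using it instead of the author date is an assumption.
submodule_commit_timestamp() {
  git -C "$1" log -1 --format=%ct
}

# Prints the difference in seconds between "now" ($1) and a commit timestamp ($2).
seconds_behind() {
  echo $(( $1 - $2 ))
}

# Usage inside the workflow step (not executed here):
#   SUBMODULE_COMMIT_TIMESTAMP=$(submodule_commit_timestamp third-party-src)
#   SECONDS_BEHIND=$(seconds_behind "$(date +%s)" "$SUBMODULE_COMMIT_TIMESTAMP")
```

Keeping the subtraction in its own function makes it possible to check the arithmetic with fixed timestamps, without a git checkout.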
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.